fix(security): harden Milady Docker provisioning against env injection#400

Open
0xSolace wants to merge 5 commits into dev from fix/security-provisioning-rce

Conversation

@0xSolace
Collaborator

Summary

This PR hardens the Milady Docker provisioning path against command injection in the remote docker run flow.

Security impact

This addresses a critical authenticated RCE risk in provisioning: user-controlled environment variables were being assembled into a remote shell command for docker run. While values were shell-quoted, validation was incomplete and the provisioning path still relied on shell interpolation for multiple user-derived fields.
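
To make the quoting concern concrete, here is a minimal sketch of POSIX single-quote escaping and of how a quoted -e flag might be assembled for a remote docker run. The shellQuote body and the buildEnvFlag helper are assumptions for illustration only; the repository's actual implementation is not shown in this description.

```typescript
// Assumed implementation: the PR only states that values were "shell-quoted".
// Wrap the value in single quotes and rewrite each embedded ' as '\''
// (close the quote, emit an escaped quote, reopen the quote).
function shellQuote(value: string): string {
  return `'${value.replace(/'/g, `'\\''`)}'`;
}

// Hypothetical helper: renders one `-e KEY=value` pair for a remote `docker run`.
function buildEnvFlag(key: string, value: string): string {
  return `-e ${shellQuote(`${key}=${value}`)}`;
}

// Command substitution stays inert inside single quotes.
console.log(buildEnvFlag("GREETING", "it's $(whoami)"));
```

Quoting of this kind neutralizes shell metacharacters, but it passes newlines and other control characters through verbatim, which is the gap the added validation is meant to close.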

Changes

  • add explicit validation helpers for:
    • env keys: ^[A-Z_][A-Z0-9_]*$
    • env values: reject null bytes and control characters
    • container names
    • volume paths
  • enforce container-name and volume-path validation before remote execution
  • replace ad-hoc env-key validation in docker-sandbox-provider with centralized helpers
  • expand unit coverage for the new validation rules and provisioning invariants
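
The env rules listed above could be sketched roughly as follows. The function names follow the PR discussion, but the bodies, the error wording, and the validateEnv wrapper are assumptions rather than the code in this diff.

```typescript
// Hypothetical sketch of the centralized env helpers; not the PR's actual code.
const ENV_KEY_PATTERN = /^[A-Z_][A-Z0-9_]*$/;
const CONTROL_CHARS = /[\x00-\x1f\x7f]/; // null byte, C0 controls, DEL

function validateEnvKey(key: string): void {
  if (!ENV_KEY_PATTERN.test(key)) {
    throw new Error(
      `Invalid environment variable key ${JSON.stringify(key)}: must match ${ENV_KEY_PATTERN.source}.`,
    );
  }
}

function validateEnvValue(key: string, value: string): void {
  // The value itself is never echoed, since it may be a secret.
  if (CONTROL_CHARS.test(value)) {
    throw new Error(
      `Invalid environment variable value for key "${key}": contains control characters.`,
    );
  }
}

// Assumed convenience wrapper: fail closed on the first bad entry.
function validateEnv(env: Record<string, string>): void {
  for (const [key, value] of Object.entries(env)) {
    validateEnvKey(key);
    validateEnvValue(key, value);
  }
}

validateEnv({ NODE_ENV: "production", API_URL: "https://example.com/?a=1&b=2" });
```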

Notes

  • This intentionally tightens accepted environment variable input for Docker-backed Milady provisioning.
  • Lowercase env names and multiline/control-character values now fail closed instead of being passed into the remote shell execution path.
  • This is a deliberate security tradeoff for the Docker provider.

Validation

  • bun test packages/tests/unit/docker-infrastructure.test.ts
  • bun test packages/tests/unit/milady-create-routes.test.ts
  • codex review --uncommitted (flagged the tighter env validation as behavior-changing; retained intentionally for security hardening)

Risk

High security value, low code-surface change. Main behavior change is stricter validation of user-supplied env vars during Docker provisioning.

@vercel

vercel bot commented Mar 20, 2026

The latest updates on your projects.

Project | Deployment | Actions | Updated (UTC)
eliza-cloud-v2 | Ready | Preview, Comment | Mar 22, 2026 9:01am

@coderabbitai

coderabbitai bot commented Mar 20, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


@claude

claude bot commented Mar 20, 2026

PR Review — fix(security): harden Milady Docker provisioning against env injection

The core security improvement (centralized uppercase-only key validation + control-character rejection) is sound and worthwhile. Several issues should be addressed before merging.


Bugs / Correctness

1. Duplicate validation in docker-sandbox-provider.ts:264-265

getContainerName() and getVolumePath() already call their respective validators internally. The explicit calls immediately after are dead code and create confusion about where validation is authoritative:

const containerName = getContainerName(agentId);  // already validates internally
const volumePath = getVolumePath(agentId);          // already validates internally
validateContainerName(containerName);               // duplicate — remove
validateVolumePath(volumePath);                     // duplicate — remove

2. validateEnvValue rejects newlines — may break legitimate multi-line values

docker-sandbox-utils.ts:80 rejects [\x00-\x1f\x7f] which includes \n. This could break callers passing ...environmentVars with PEM certificates or other base64-encoded blobs containing embedded newlines. If this is intentional, the error message should name the offending key and explain the restriction:

// Current — hard to diagnose in production:
throw new Error(`Invalid environment variable value: contains control characters.`);

// Better:
throw new Error(`Invalid environment variable value for key "${key}": contains control characters (newlines and PEM-encoded values are not supported).`);

3. validateVolumePath allows trailing slashes

docker-sandbox-utils.ts:94: the regex permits paths like /data/agents/ (trailing slash), which produces a different path than /data/agents and could cause subtle issues if paths are later compared or concatenated.


Security

4. TS_EXTRA_ARGS value "--accept-routes" is safe, but the coupling is fragile

vpnEnvVars keys with lowercase letters or hyphens would fail validateEnvKey and cause an opaque provisioning error. Consider exempting internally-generated env vars from the key validator, or documenting that all internal env keys must be UPPER_SNAKE_CASE.


Design

5. Validation inside factory functions breaks the cleanup path

getContainerName() and getVolumePath() now throw on validation failure. These are called from the error-recovery path in create(). If agentId is somehow invalid at cleanup time, the validation error would mask the original failure and the ghost container would not be cleaned up.

6. Removed comment was architecturally informative

The removed comment block (explaining why shellQuote() makes value injection safe) has no replacement in validateEnvValue's JSDoc. Future maintainers won't understand the security model without it.


Test Coverage Gaps

  • No test for validateContainerName at the 128-char boundary
  • No test for validateVolumePath with a trailing slash
  • No test for validateEnvValue with real-world values: UUIDs, URLs, base64 with +, /, and =
  • No end-to-end test through _createOnce() confirming a lowercase caller-supplied key is rejected
  • milady-create-routes.test.ts is referenced in the PR description but has no changes — appears to be a stale reference

Nits

  • docker-sandbox-utils.ts:87 JSDoc says "simple shell-safe identifiers" but the regex also allows . and -
  • docker-sandbox-utils.ts:94 character class [A-Za-z0-9._/-]: the hyphen is already last in the class, so it is parsed as a literal and there is no range ambiguity; keep it at the end (or escape it) if the class is ever reordered

Summary

Issue | Severity
Duplicate validateContainerName/validateVolumePath calls | Low (dead code)
Multi-line env values silently rejected, error not debuggable | Medium
Trailing slash passes validateVolumePath | Low
Validation in factory breaks error-recovery cleanup path | Low (design)
Removed informative comment not replaced | Low
Missing boundary/edge-case tests | Low

Actionable before merge:

  1. Remove the duplicate validation calls in docker-sandbox-provider.ts:264-265
  2. Document (and surface in the error message) that multi-line values are intentionally rejected, including the offending key name
  3. Add missing boundary and edge-case tests

@claude

claude bot commented Mar 20, 2026

Code Review

Overall this is a solid, well-scoped security hardening PR. The security intent is sound and the implementation is clean. A few issues worth addressing before merging:


Bug: Container name length cap is inconsistent with validateAgentId

validateAgentId permits agent IDs up to 128 characters, but getContainerName produces milady-${agentId} (prefix is 7 chars), and validateContainerName rejects names longer than 128 characters total. Any agentId longer than 121 characters will cause getContainerName to throw at runtime:

// validateAgentId allows up to 128 chars:
if (!/^[a-zA-Z0-9_-]{1,128}$/.test(agentId)) { ... }

// getContainerName generates a name up to 135 chars:
const containerName = `milady-${agentId}`;  // 7 + up to 128 = 135 chars

// validateContainerName rejects > 128 chars:
if (!/^[a-zA-Z0-9][a-zA-Z0-9_.-]{0,127}$/.test(containerName)) { ... }

In practice agent IDs are UUIDs (36 chars, so total 43) and this won't trigger, but the inconsistency means an agent ID that passes validateAgentId can still make getContainerName throw during provisioning. Either tighten validateAgentId to 121 chars max, raise the validateContainerName limit to 135 chars, or add a test that documents the effective cap.
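
One way to keep the two limits from drifting apart is to derive the agent-ID cap from the container-name cap. The sketch below assumes the "tighten validateAgentId" option; the constant names and error text are illustrative, not taken from the diff.

```typescript
// Hypothetical sketch: derive the agent-ID cap from the container-name cap
// so the two limits cannot disagree.
const CONTAINER_NAME_PREFIX = "milady-";
const MAX_CONTAINER_NAME_LENGTH = 128;
const MAX_AGENT_ID_LENGTH = MAX_CONTAINER_NAME_LENGTH - CONTAINER_NAME_PREFIX.length; // 128 - 7 = 121

const AGENT_ID_PATTERN = new RegExp(`^[a-zA-Z0-9_-]{1,${MAX_AGENT_ID_LENGTH}}$`);

function validateAgentId(agentId: string): void {
  if (!AGENT_ID_PATTERN.test(agentId)) {
    throw new Error(
      `Invalid agent ID: must be 1-${MAX_AGENT_ID_LENGTH} characters from [a-zA-Z0-9_-].`,
    );
  }
}

function getContainerName(agentId: string): string {
  validateAgentId(agentId);
  // With the derived cap, the result is at most MAX_CONTAINER_NAME_LENGTH characters.
  return `${CONTAINER_NAME_PREFIX}${agentId}`;
}

console.log(getContainerName("a".repeat(MAX_AGENT_ID_LENGTH)).length);
```

If the prefix ever changes, the cap follows automatically, at the cost of making the maximum agent-ID length depend on a naming detail.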


Minor: Redundant validation of deterministic output in getContainerName / getVolumePath

Both functions now validate their own deterministic output:

export function getContainerName(agentId: string): string {
  validateAgentId(agentId);               // validates input
  const containerName = `milady-${agentId}`;
  validateContainerName(containerName);   // validates derived output
  return containerName;
}

Since the output is a pure function of already-validated input, the second call adds overhead without catching real bugs — and as noted above, it actively breaks for long-but-valid agent IDs. The value of validating derived output would only be if the template string changed; a code comment explaining why it's there (or removing it) would be clearer.


Minor: vpnEnvVars keys assumed to be uppercase

The validation loop applies validateEnvKey to all entries in allEnv, which includes vpnEnvVars spread from headscaleIntegration.prepareContainerVPN. Looking at headscale-integration.ts:68-74, the current keys (HEADSCALE_URL, TS_AUTHKEY, TS_HOSTNAME, TS_STATE_DIR, TS_EXTRA_ARGS) are all uppercase — so this is fine today. But if someone adds a lowercase key to that map in the future, provisioning will silently fail. A comment near the vpnEnvVars spread would help flag this constraint.


Minor: No test for getContainerName rejecting an agentId > 121 chars

Given the length inconsistency above, a test like this would make the behavior explicit:

test("getContainerName rejects agentId that would exceed 128-char container name limit", () => {
  // agentId of 122 chars: "milady-" + 122 = 129, which exceeds the 128-char cap
  const longId = "a".repeat(122);
  expect(() => getContainerName(longId)).toThrow();
});

Either the test should pass (confirming the bug exists) or validateAgentId should be tightened to prevent it.


Positive observations

  • Centralizing validation is the right call — removing the inline regex and the old rationale comment is cleaner.
  • validateEnvValue catching control characters (including \x7f) is correct and the error message is genuinely useful for debugging PEM/multiline cases.
  • validateVolumePath double-checking normalized forms (//, /./, /../) after the regex is a good defense-in-depth pattern.
  • Test coverage for the new helpers is thorough and clearly structured.
  • The security tradeoff (fail-closed on lowercase env keys / control chars) is well-documented in both the PR description and code comments.

@claude

claude bot commented Mar 20, 2026

Code Review

Overall this is a solid security hardening PR: the defense-in-depth approach (validate inputs, then validate derived outputs) is the right call, and the test coverage is good. One hardening suggestion and a few smaller items below.


Hardening: don't rely on $ anchor semantics to reject trailing newlines

All the new validators use $ as the end-of-string anchor. In JavaScript, $ without the m flag matches only at the very end of the input. Unlike Python, Perl, or PCRE, it does not also match just before a trailing \n. This means:

/^[A-Z_][A-Z0-9_]*$/.test("ABC\n")   // false: the trailing newline is rejected
/^[a-zA-Z0-9][a-zA-Z0-9_.-]{0,127}$/.test("milady-agent\n")  // false

validateEnvValue does not depend on the anchor at all because it actively scans the value for [\x00-\x1f] (which includes \n). validateEnvKey and validateContainerName rely on the regex alone. That is sufficient today: a key like "MILADY_SECRET\n" fails validateEnvKey. But if the m flag were ever added, or the pattern were ported to an engine with Perl-style $, the trailing newline would end up in the -e KEY\n=value flag passed to the remote shell.

Optional hardening: apply the same control-character scan to keys so the guarantee does not depend on anchor semantics:

export function validateEnvKey(key: string): void {
  if (!/^[A-Z_][A-Z0-9_]*$/.test(key) || /[\x00-\x1f\x7f]/.test(key)) {
    throw new Error(`Invalid environment variable key "${key}": must match ^[A-Z_][A-Z0-9_]*$.`);
  }
}

The same scan can be applied to validateContainerName and validateAgentId.


Breaking change: lowercase env keys now rejected at runtime

The old guard was ^[a-zA-Z_][a-zA-Z0-9_]*$; the new one is ^[A-Z_][A-Z0-9_]*$. Any caller passing lowercase or mixed-case keys (e.g. Node_ENV, node_env) will get a runtime throw that previously succeeded. The PR description calls this out as intentional, which is fine — but it would be worth adding a note in the function's JSDoc or a CHANGELOG entry so callers know this is an explicit constraint, not just a regex typo.


validateVolumePath: two-pass structure is correct but worth a comment

The regex ^\/[A-Za-z0-9._/\-]+$ plus the path-normalization checks catches traversal reliably. One subtle case that's handled correctly: /../etc/passwd matches the character-class (. and / are both allowed) but is caught by volumePath.includes("/../"). Good. A brief comment linking the two passes would help future readers understand why both are needed rather than the regex alone.
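
The two passes could be sketched like this. It is an illustrative variant, not the PR's code: instead of includes("//"), includes("/./"), and includes("/../"), it inspects path segments, which additionally rejects a trailing /.. or a trailing slash. Whether the real helper should be that strict is an open design choice.

```typescript
// Hypothetical two-pass sketch of validateVolumePath.
const VOLUME_PATH_PATTERN = /^\/[A-Za-z0-9._\/-]+$/;

function validateVolumePath(volumePath: string): void {
  // Pass 1: character allowlist. Rejects relative paths, whitespace,
  // shell metacharacters, control characters, and non-ASCII input.
  if (!VOLUME_PATH_PATTERN.test(volumePath)) {
    throw new Error(`Invalid volume path ${JSON.stringify(volumePath)}: disallowed characters.`);
  }
  // Pass 2: structure. "." and "/" are allowed characters, so traversal
  // segments such as ".." survive pass 1 and must be caught here.
  const segments = volumePath.split("/").slice(1); // drop the empty piece before the leading "/"
  if (segments.some((s) => s === "" || s === "." || s === "..")) {
    throw new Error(`Invalid volume path ${JSON.stringify(volumePath)}: non-normalized segment.`);
  }
}

validateVolumePath("/data/agents/abc-123");
```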


Minor: validateContainerName error message doesn't include the valid pattern

validateEnvKey tells the caller exactly what pattern is required. validateContainerName just says Invalid container name "X" with no hint. Matching the style of the env-key error would make debugging easier:

throw new Error(`Invalid container name "${containerName}": must match ^[a-zA-Z0-9][a-zA-Z0-9_.-]{0,127}$.`);

Positive notes

  • Double validation in getContainerName / getVolumePath (validate input agentId, then validate the derived output) is a good pattern and the inline comment explaining the guardrail intent is appreciated.
  • Including key in the validateEnvValue error message is exactly right for prod debuggability.
  • Test coverage hits all the meaningful boundaries (boundary-exact lengths, traversal patterns, control chars, shell metacharacters).
  • Removing the now-incorrect comment that claimed shellQuote was sufficient for values is the right call.

@claude

claude bot commented Mar 20, 2026

Code Review

Overall this is a well-structured security hardening PR with clear intent and good test coverage. A few issues worth addressing before merge.


Bugs / Correctness

1. validateContainerName: hasControlChars is dead code

// docker-sandbox-utils.ts:107
if (hasControlChars(containerName) || !/^[a-zA-Z0-9][a-zA-Z0-9_.-]{0,127}$/.test(containerName)) {

The character class [a-zA-Z0-9_.-] already excludes all control characters, so hasControlChars can never be the sole reason for rejection here: any input it flags would also fail the regex. This is a minor nit, but it implies the function behaves differently from how it reads. The same applies to validateEnvKey (the regex ^[A-Z_][A-Z0-9_]*$ also excludes control chars). With hasControlChars meaningful only in validateEnvValue and validateAgentName, the pattern is inconsistent.

2. Silent max-length reduction for validateAgentId

MAX_AGENT_ID_LENGTH is now 128 - 7 = 121, down from the prior hard-coded 128. Any in-flight agent IDs between 122–128 chars would now fail validation on the next provisioning call. Real UUIDs are 36 chars so this is unlikely to affect production, but it is an undocumented breaking change. Worth a note in the PR or a migration check.

3. validateVolumePath rejects non-ASCII Unicode; validateEnvValue does not

hasControlChars only covers [\x00-\x1f\x7f], so Unicode characters (e.g. emoji, zero-width spaces, RTL override markers) pass the control-char check. For volume paths this is fine: the regex ^\/[A-Za-z0-9._/\-]+$ restricts input to ASCII alphanumerics, dot, underscore, slash, and hyphen, so such characters fail the regex.

However, validateEnvValue has no such character restriction: only control chars are blocked. A value containing Unicode homoglyph substitutions or invisible Unicode markers will pass. This is probably acceptable for env values (keys and passwords can legitimately contain Unicode), but worth noting.


Security

4. resolvedImage is not validated

// docker-sandbox-provider.ts:334
shellQuote(resolvedImage),

Docker image names with shellQuote are safe from shell injection, but there's no format check to prevent a caller from specifying an image from an arbitrary registry (e.g. attacker.example.com/malicious:latest). This is a pre-existing issue and out of scope for this PR, but worth tracking separately.

5. Validation placement in getContainerName / getVolumePath

The double-validation (calling validateAgentId then validateContainerName/validateVolumePath on the derived output) is correct and a good defense-in-depth pattern. One clarification: since validateAgentId guarantees ^[a-zA-Z0-9_-]{1,121}$, the container name milady-{agentId} is guaranteed to satisfy validateContainerName, so the second check is purely a guardrail against template changes. The comment already says this, which is good.


Test Coverage Gaps

6. Missing edge case: empty string env key

validateEnvKey("") should throw but there's no test for it. The regex ^[A-Z_][A-Z0-9_]*$ does reject it, but an explicit test would lock this in.

7. Missing edge case: root path /

validateVolumePath("/") should fail since the regex requires at least one char after the leading /. No test for it.

test("rejects root path", () => {
  expect(() => validateVolumePath("/")).toThrow(/Invalid volume path/);
});

8. milady-create-routes.test.ts referenced in PR description but not in diff

The PR description lists bun test packages/tests/unit/milady-create-routes.test.ts as a validation step, but no changes appear in that file. Were new tests meant to be added there? Or is this just the existing test suite being run as a regression check?


Nits

  • The hasControlChars helper is a non-exported function, which is correct, but consider consolidating the redundant uses behind a single clear API if this pattern grows.
  • Error message for validateEnvValue mentions "PEM-encoded values are not supported" — this is useful user-facing wording but feels slightly out of place in a generic validation utility. Fine as-is.

Summary

The core security fix is correct and well-implemented. The centralized helpers, test coverage, and defense-in-depth double-validation are all solid. The two items I'd prioritize before merge:

  1. Document or guard the MAX_AGENT_ID_LENGTH regression (item 2 above)
  2. Add the missing edge-case tests for empty env key and root volume path (items 6–7)

The redundant hasControlChars checks (item 1) are harmless but worth cleaning up for consistency.

@claude

claude bot commented Mar 22, 2026

First section of review posted

@claude

claude bot commented Mar 22, 2026

Code Review

Good security hardening overall. The centralized validation approach is the right architecture — replacing ad-hoc inline checks with shared helpers reduces the chance of future bypass. Test coverage is solid. A few issues worth addressing:


Bug: validateContainerName one-char min vs. Docker's actual limit

The regex accepts 1-char names (e.g. 'm'), but Docker actually requires a minimum of 2 characters. The container name 'm' would be rejected at the Docker API level after passing your validator. Adjust the second group from {0,127} to {1,127} to enforce a 2-128 char range.
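
A minimal sketch of the suggested adjustment, assuming the 2-128 character range described above. The constant and function names are illustrative.

```typescript
// Hypothetical sketch: one leading alphanumeric plus 1-127 further characters,
// so accepted names are 2-128 characters long.
const CONTAINER_NAME_PATTERN = /^[a-zA-Z0-9][a-zA-Z0-9_.-]{1,127}$/;

function isValidContainerName(name: string): boolean {
  return CONTAINER_NAME_PATTERN.test(name);
}

console.log(isValidContainerName("m"));            // false: below the 2-char minimum
console.log(isValidContainerName("milady-agent")); // true
```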


Redundant hasControlChars before regex in four validators

validateAgentId, validateEnvKey, validateContainerName, and validateVolumePath all call hasControlChars before a regex that already rejects those same characters, so hasControlChars never returns true for input the regex would accept. This is not wrong (it is defense-in-depth), but it could confuse future maintainers. Either remove the redundant calls or add a comment noting the intentional layering.


validateVolumePath misses PATH_MAX (informational)

Given that agentId is bounded by MAX_AGENT_ID_LENGTH = 121 and the prefix is /data/agents/ (13 chars), the max possible path is 134 chars — well within PATH_MAX (4095). No fix needed, but a comment here would guard against future prefix changes silently reopening this.


Missing test: empty string for validateContainerName

The suite covers shell metacharacters, trailing newlines, and length overflow, but not the empty-string case. The regex implicitly rejects it (the leading [a-zA-Z0-9] requires at least one character), but an explicit test documents that expectation.


milady-create-routes.test.ts referenced in validation but not in the diff

The PR description lists that file as a validation step, but it is not in the diff. If the new validators propagate differently at the route level, integration coverage should be in the PR. If the file was run but unchanged, a note clarifying that would help reviewers.


Minor: validateEnvValue error message is misleading for non-PEM control chars

The message always says 'newlines and PEM-encoded values are not supported', but it fires for any control character including null bytes and tabs. A more precise message would reference control characters 0x00-0x1f and 0x7f rather than only mentioning newlines.
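
A more precise message could name the offending code point and its position without echoing the value, which may be a secret. The sketch below is one possible shape; findControlChar and the exact wording are assumptions, not part of the diff.

```typescript
// Hypothetical sketch: report which control character was found, and where.
function findControlChar(value: string): { index: number; code: number } | null {
  for (let i = 0; i < value.length; i++) {
    const code = value.charCodeAt(i);
    if (code <= 0x1f || code === 0x7f) {
      return { index: i, code };
    }
  }
  return null;
}

function validateEnvValue(key: string, value: string): void {
  const hit = findControlChar(value);
  if (hit !== null) {
    const hex = `0x${hit.code.toString(16).padStart(2, "0")}`;
    throw new Error(
      `Invalid environment variable value for key "${key}": control character ${hex} at index ${hit.index} ` +
        `(characters 0x00-0x1f and 0x7f are rejected, including newlines and tabs).`,
    );
  }
}

try {
  validateEnvValue("TLS_CERT", "line1\nline2");
} catch (err) {
  console.log((err as Error).message);
}
```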


Nit: escaped hyphen in validateVolumePath regex

The backslash-hyphen is valid but the conventional form places the literal hyphen at the end of the character class. Avoids potential confusion with a character range.


Overall

The security intent is sound, the tradeoff (uppercase-only env keys, no control chars in values) is clearly documented, and validation is correctly applied before shell interpolation. The getContainerName/getVolumePath double-validation guardrail is a nice defensive touch. The Docker 1-char minimum discrepancy and the missing empty-string test are the only items worth fixing before merge.
